Tree-Structured Parzen Estimator


Opening the Black Box: Nowcasting Singapore's GDP Growth and its Explainability

Attolico, Luca

arXiv.org Artificial Intelligence

Timely assessment of current conditions is essential, especially for small, open economies such as Singapore, where external shocks transmit rapidly to domestic activity. We develop a real-time nowcasting framework for quarterly GDP growth using a high-dimensional panel of approximately 70 economic and financial indicators over 1990Q1-2023Q2. The analysis covers penalized regressions, dimensionality-reduction methods, ensemble learning algorithms, and neural architectures, benchmarked against a Random Walk, an AR(3), and a Dynamic Factor Model. The pipeline preserves temporal ordering through an expanding-window walk-forward design with Bayesian hyperparameter optimization, and uses moving block-bootstrap procedures both to construct prediction intervals and to obtain confidence bands for feature-importance measures. It adopts model-specific and XAI-based explainability tools. A Model Confidence Set procedure identifies statistically superior learners, which are then combined through simple, weighted, and exponentially weighted schemes; the resulting time-varying weights provide an interpretable representation of model contributions. Predictive ability is assessed via Giacomini-White tests. Empirical results show that penalized regressions, dimensionality-reduction models, and GRU networks consistently outperform all benchmarks, with RMSFE reductions of roughly 40-60%; aggregation delivers further gains. Feature-attribution methods highlight industrial production, external trade, and labor-market indicators as the dominant drivers of Singapore's short-run growth dynamics.
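A minimal sketch of the expanding-window walk-forward design and the moving block bootstrap described above, assuming sklearn-style learners; fit_model is a hypothetical placeholder for any of the paper's models, not the authors' code.

```python
import numpy as np

def walk_forward(X, y, fit_model, min_train=60):
    """One-step-ahead nowcasts: at each quarter t, train only on data before t."""
    preds = []
    for t in range(min_train, len(y)):
        model = fit_model(X[:t], y[:t])            # expanding training window
        preds.append(model.predict(X[t:t + 1])[0])
    return np.asarray(preds)

def block_bootstrap_interval(point, residuals, block=4, n_boot=2000, alpha=0.10):
    """Moving block bootstrap over past residuals -> (lo, hi) prediction interval.
    Blocks of consecutive quarters preserve serial dependence in the errors."""
    rng = np.random.default_rng(0)
    starts = rng.integers(0, len(residuals) - block + 1, size=n_boot)
    pooled = np.concatenate([residuals[s:s + block] for s in starts])
    lo, hi = np.quantile(pooled, [alpha / 2, 1 - alpha / 2])
    return point + lo, point + hi
```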


Evaluating the Efficiency of Latent Spaces via the Coupling-Matrix

Yavuz, Mehmet Can, Yanikoglu, Berrin

arXiv.org Artificial Intelligence

A central challenge in representation learning is constructing latent embeddings that are both expressive and efficient. In practice, deep networks often produce redundant latent spaces where multiple coordinates encode overlapping information, reducing effective capacity and hindering generalization. Standard metrics such as accuracy or reconstruction loss provide only indirect evidence of such redundancy and cannot isolate it as a failure mode. We introduce a redundancy index, denoted rho(C), that directly quantifies inter-dimensional dependencies by analyzing coupling matrices derived from latent representations and comparing their off-diagonal statistics against a normal distribution via energy distance. The result is a compact, interpretable, and statistically grounded measure of representational quality. We validate rho(C) across discriminative and generative settings on MNIST variants, Fashion-MNIST, CIFAR-10, and CIFAR-100, spanning multiple architectures and hyperparameter optimization strategies. Empirically, low rho(C) reliably predicts high classification accuracy or low reconstruction error, while elevated redundancy is associated with performance collapse. Estimator reliability grows with latent dimension, yielding natural lower bounds for reliable analysis. We further show that Tree-structured Parzen Estimators (TPE) preferentially explore low-rho regions, suggesting that rho(C) can guide neural architecture search and serve as a redundancy-aware regularization target. By exposing redundancy as a universal bottleneck across models and tasks, rho(C) offers both a theoretical lens and a practical tool for evaluating and improving the efficiency of learned representations.
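A hedged sketch in the spirit of rho(C). The paper defines its own coupling-matrix construction; here the correlation matrix of latent codes is an assumed stand-in, with SciPy's energy_distance comparing standardized off-diagonal entries against a normal reference, so the scale (and sign convention) of the score should not be read as the paper's.

```python
import numpy as np
from scipy.stats import energy_distance

def rho_index(Z, n_ref=10_000, seed=0):
    """Z: (n_samples, d) latent codes -> scalar redundancy score."""
    C = np.corrcoef(Z, rowvar=False)            # assumed proxy for the coupling matrix
    off = C[~np.eye(C.shape[0], dtype=bool)]    # inter-dimensional couplings only
    off = (off - off.mean()) / (off.std() + 1e-12)
    ref = np.random.default_rng(seed).standard_normal(n_ref)
    return energy_distance(off, ref)            # deviation of off-diagonals from normal
```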


Tree-Structured Parzen Estimator Can Solve Black-Box Combinatorial Optimization More Efficiently

Abe, Kenshin, Wang, Yunzhuo, Watanabe, Shuhei

arXiv.org Artificial Intelligence

Tree-structured Parzen estimator (TPE) is a versatile hyperparameter optimization (HPO) method supported by popular HPO tools. Since these HPO tools have been developed in line with the trend of deep learning (DL), problem setups common in the DL domain, such as multi-objective optimization and multi-fidelity optimization, have been discussed for TPE. However, the practical applications of HPO are not limited to DL, and black-box combinatorial optimization is actively used in domains such as chemistry and biology. As combinatorial optimization has remained an untouched yet very important topic for TPE, we propose an efficient combinatorial optimization algorithm for TPE. In this paper, we first generalize the categorical kernel with the numerical kernel in TPE, enabling us to introduce a distance structure into the categorical kernel. We then discuss modifications to the newly developed kernel for handling a large combinatorial search space; these modifications reduce the time complexity of the kernel calculation with respect to the size of the combinatorial search space. In experiments on synthetic problems, we verify that our proposed method identifies better solutions with fewer evaluations than the original TPE. Our algorithm is available in Optuna, an open-source framework for HPO.
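A runnable Optuna sketch of TPE on a toy black-box combinatorial objective. The plain TPESampler is shown; the paper's distance-aware categorical kernel ships in Optuna but may require a recent version and options not used here. The "sequence design" objective is invented purely for illustration.

```python
import optuna

ALPHABET = list("ACGT")
TARGET = list("ACGTACGT")

def objective(trial):
    # Eight categorical choices: a 4^8-point combinatorial search space.
    seq = [trial.suggest_categorical(f"pos_{i}", ALPHABET) for i in range(8)]
    return -sum(a == b for a, b in zip(seq, TARGET))  # black-box score to minimize

study = optuna.create_study(sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=100)
print(study.best_params, study.best_value)
```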


Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Sieradzki, Szymon, Mańdziuk, Jacek

arXiv.org Artificial Intelligence

In this paper we review hyperparameter optimization methods for machine learning models, with a particular focus on the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm. We propose several modifications to ATPE and assess their efficacy on a diverse set of standard benchmark functions. Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization on selected benchmarks, a finding of practical relevance for their application to real-world machine learning and optimization tasks.

In machine learning, the performance of a model depends heavily on the correct choice of hyperparameters, such as the learning rate, the number of layers in a neural network, or specific regularization techniques. These hyperparameters form a multidimensional space where some dimensions are continuous (e.g., the learning rate) while others are discrete (e.g., the number of network layers). Hyperparameter Optimization (HPO) aims to find the best combination of these hyperparameters by searching this space in a way that optimizes a predefined objective function. In supervised learning, this function is usually a loss function, which quantifies the error between the model's predictions and the true values. HPO is applicable across a wide range of machine learning models, as most optimization techniques are agnostic to the underlying model type. The core requirement for any HPO algorithm is to define the hyperparameter space and the objective function. However, HPO presents specific challenges that set it apart from other optimization problems. Each evaluation of the objective function requires training the machine learning model from scratch, which is often the most time-consuming part of the optimization process. As a result, when designing HPO algorithms, the focus is less on the internal computational efficiency of the optimizer than on minimizing the number of objective function evaluations while maintaining good predictive performance.
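A minimal sketch of the HPO setup just described: a mixed continuous/discrete space and an objective whose every evaluation stands in for a full training run. train_and_validate is a hypothetical placeholder, and this uses Optuna's standard TPE sampler rather than the ATPE variant studied in the paper.

```python
import optuna

def train_and_validate(lr, n_layers):
    # Placeholder for the expensive part: train a model, return validation loss.
    return (lr - 1e-3) ** 2 + 0.01 * abs(n_layers - 3)

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)  # continuous dimension
    n_layers = trial.suggest_int("n_layers", 1, 8)        # discrete dimension
    return train_and_validate(lr, n_layers)               # one costly evaluation

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
```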


A Goal-Driven Approach to Systems Neuroscience

Nayebi, Aran

arXiv.org Artificial Intelligence

Humans and animals exhibit a range of interesting behaviors in dynamic environments, and it is unclear how our brains actively reformat this dense sensory information to enable these behaviors. Experimental neuroscience is undergoing a revolution in its ability to record and manipulate hundreds to thousands of neurons while an animal is performing a complex behavior. As these paradigms enable unprecedented access to the brain, a natural question that arises is how to distill these data into interpretable insights about how neural circuits give rise to intelligent behaviors. The classical approach in systems neuroscience has been to ascribe well-defined operations to individual neurons and provide a description of how these operations combine to produce a circuit-level theory of neural computations. While this approach has had some success for small-scale recordings with simple stimuli designed to probe a particular circuit computation, these efforts often ultimately lead to disparate descriptions of the same system across stimuli. Perhaps more strikingly, many response profiles of neurons are difficult to describe succinctly in words, suggesting that new approaches are needed in light of these experimental observations. In this thesis, we offer a different definition of interpretability that we show has promise in yielding unified structural and functional models of neural circuits, and that describes the evolutionary constraints giving rise to the response properties of the neural population, including those that have previously been difficult to describe individually. We demonstrate the utility of this framework across multiple brain areas and species to study the roles of recurrent processing in the primate ventral visual pathway; mouse visual processing; heterogeneity in rodent medial entorhinal cortex; and facilitating biological learning.


Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance

Watanabe, Shuhei

arXiv.org Artificial Intelligence

Recent advances in many domains require increasingly complicated experiment design. Such experiments often have many parameters, which necessitates parameter tuning. Tree-structured Parzen estimator (TPE), a Bayesian optimization method, is widely used in recent parameter-tuning frameworks. Despite its popularity, the roles of its control parameters and the intuition behind the algorithm have not been thoroughly discussed so far. In this tutorial, we identify the roles of each control parameter and their impacts on hyperparameter optimization using a diverse set of benchmarks. We compare our recommended setting, drawn from an ablation study, with baseline methods and demonstrate that it improves the performance of TPE.
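The kind of control parameters the tutorial dissects are exposed on Optuna's TPESampler; the values below are illustrative choices, not the paper's recommended setting.

```python
import optuna

sampler = optuna.samplers.TPESampler(
    n_startup_trials=10,       # random trials before the Parzen estimators are fit
    n_ei_candidates=24,        # candidates drawn when maximizing the l(x)/g(x) ratio
    prior_weight=1.0,          # weight of the non-informative prior in each KDE
    consider_endpoints=False,  # whether KDE bandwidths extend to the domain edges
    multivariate=True,         # model the search space jointly, not per-dimension
    seed=42,
)
study = optuna.create_study(sampler=sampler)
```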


Building a Tree-Structured Parzen Estimator from Scratch (Kind Of)

#artificialintelligence

The way a machine learning model fits itself to data is governed by a set of initial conditions called hyperparameters. Hyperparameters restrict the learning behavior of a model so that it will (hopefully) fit the data well and within a reasonable amount of time. Finding the best set of hyperparameters (often called "tuning") is one of the most important and time-consuming parts of the modeling task. Historical approaches to hyperparameter tuning involve either a brute-force or a random search over a grid of hyperparameter combinations, called Grid Search and Random Search, respectively. Although popular, Grid and Random Search lack any mechanism for converging toward a good set of hyperparameters; they are purely trial and error.
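A compact sketch of the core TPE idea this post builds up to, under simplifying assumptions (one continuous parameter, SciPy KDEs in place of TPE's weighted Parzen estimators): split past trials into "good" and "bad" by a quantile, fit a density to each, and propose the candidate maximizing the good-to-bad density ratio l(x)/g(x).

```python
import numpy as np
from scipy.stats import gaussian_kde

def tpe_step(xs, ys, bounds=(0.0, 1.0), gamma=0.25, n_candidates=64, rng=None):
    rng = rng or np.random.default_rng()
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    split = np.quantile(ys, gamma)                 # gamma-quantile of observed losses
    good, bad = xs[ys <= split], xs[ys > split]    # "good" vs "bad" trials
    l, g = gaussian_kde(good), gaussian_kde(bad)   # Parzen density estimators
    cand = rng.uniform(*bounds, size=n_candidates)
    return cand[np.argmax(l(cand) / (g(cand) + 1e-12))]

# Toy usage: minimize (x - 0.3)^2 on [0, 1], starting from 10 random trials.
rng = np.random.default_rng(0)
xs = list(rng.uniform(0, 1, 10))
ys = [(x - 0.3) ** 2 for x in xs]
for _ in range(40):
    x = tpe_step(xs, ys, rng=rng)
    xs.append(x)
    ys.append((x - 0.3) ** 2)
print(min(ys))
```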